Search for: All records

Creators/Authors contains: "Cheng, Albert"

  1. Schlick, Tamar (Ed.)
    Dictionary learning (DL), implemented via matrix factorization (MF), is commonly used in computational biology to tackle ubiquitous clustering problems. The method is favored for its conceptual simplicity and relatively low computational complexity. However, DL algorithms produce results that lack interpretability in terms of real biological data. Additionally, they are not optimized for graph-structured data and hence often fail to handle such data in a scalable manner. To address these limitations, we propose a novel DL algorithm called online convex network dictionary learning (online cvxNDL). Like classical DL algorithms, online cvxNDL is implemented via MF; unlike them, it is designed to handle extremely large datasets by virtue of its online nature. Importantly, it enables the interpretation of dictionary elements, which serve as cluster representatives, through convex combinations of real measurements. Moreover, the algorithm can be applied to data with a network structure by incorporating specialized subnetwork sampling techniques. To demonstrate the utility of our approach, we apply online cvxNDL to 3D-genome RNAPII ChIA-Drop data with the goal of identifying important long-range interaction patterns (long-range dictionary elements). ChIA-Drop probes higher-order interactions and produces data in the form of hypergraphs whose nodes represent genomic fragments and whose hyperedges represent observed physical contacts. Our hypergraph model analysis aims to create an interpretable dictionary of long-range interaction patterns that accurately represent global chromatin physical contact maps. Through the use of dictionary information, one can also associate the contact maps with RNA transcripts and infer cellular functions. To this end, we focus on RNAPII-enriched ChIA-Drop data from Drosophila melanogaster S2 cell lines. Our results offer two key insights. First, we demonstrate that online cvxNDL retains the accuracy of classical DL (MF) methods while simultaneously ensuring unique interpretability and scalability. Second, we identify distinct collections of proximal and distal interaction patterns involving chromatin elements shared by related processes across different chromosomes, as well as patterns unique to specific chromosomes. To associate the dictionary elements with biological properties of the corresponding chromatin regions, we employ Gene Ontology (GO) enrichment analysis and perform multiple RNA coexpression studies.
  2. Abstract: Three-dimensional (3D) structures of the genome are dynamic, heterogeneous and functionally important. Live cell imaging has become the leading method for tracking chromatin dynamics. However, existing CRISPR- and TALE-based genomic labeling techniques have been hampered by laborious protocols and are ineffective in labeling non-repetitive sequences. Here, we report a versatile CRISPR/Casilio-based imaging method that allows a non-repetitive genomic locus to be labeled using one guide RNA. We construct Casilio dual-color probes to visualize the dynamic interactions of DNA elements in single live cells in the presence or absence of the cohesin subunit RAD21. Using a three-color palette, we track the dynamic 3D locations of multiple reference points along a chromatin loop. Casilio imaging reveals intercellular heterogeneity and interallelic asynchrony in chromatin interaction dynamics, underscoring the importance of studying genome structures in 4D.
  3. Abstract: Oncogenic extrachromosomal DNA elements (ecDNA) play an important role in tumor evolution, but our understanding of ecDNA biology is limited. We determined the distribution of single-cell ecDNA copy number across patient tissues and cell line models and observed that ecDNA frequency varies greatly from cell to cell. The exceptional intratumoral heterogeneity of ecDNA suggested ecDNA-specific replication and propagation mechanisms. To evaluate the transfer of ecDNA genetic material from parental to offspring cells during mitosis, we established the CRISPR-based ecTag method. ecTag leverages ecDNA-specific breakpoint sequences to tag ecDNA with fluorescent markers in living cells. Applying ecTag during mitosis revealed disjointed ecDNA inheritance patterns, enabling rapid ecDNA accumulation in individual cells. After mitosis, ecDNAs clustered into ecDNA hubs, and ecDNA hubs colocalized with RNA polymerase II, promoting transcription of cargo oncogenes. Our observations provide direct evidence for uneven segregation of ecDNA and shed new light on mechanisms through which ecDNAs contribute to oncogenesis. Significance: ecDNAs are vehicles for oncogene amplification. The circular nature of ecDNA affords unique properties, such as mobility and ecDNA-specific replication and segregation behavior. We uncovered fundamental ecDNA properties by tracking ecDNAs in live cells, highlighting uneven and random segregation and ecDNA hubs that drive cargo gene transcription. See related commentary by Henssen, p. 293. This article is highlighted in the In This Issue feature, p. 275.